INTRODUCTION


The least-squares method for curve fitting, that is, for defining a curve that best
approximates a data set, is well known and is used in almost every technical discipline.
However, there has always been the recurrent problem of how to efficiently choose
the degree of the polynomial to obtain a fit that is at least moderately good without
any prior information concerning the nature of the data. Although different types
of data are approached by various methods, oscillatory (high-frequency) data are
usually smoothed, while nonoscillatory (low-frequency) data are usually approximated.

The algorithm discussed in this paper deals primarily with the efficiency of
choosing the degree of the polynomial as it relates to nonoscillatory data but does
not neglect the smoothing of high-frequency data.

SYMBOLS

$L$                          degree of approximating polynomial
NP                           total number of peaks
TP                           total number of data points
$w$                          weighting function
$x_i$, $y_i$                 discrete data points
$\epsilon_1$, $\epsilon_2$   arbitrary tolerance values

DISCUSSION


The least-squares approach to curve fitting using a polynomial as a model is
based on the minimization of the sum of the squares of the differences between the 
data and a polynomial of degree m evaluated at corresponding given observations. 
(Note that the use of least squares assumes the errors in the data to be normally 
distributed with a mean of zero.) The function to be minimized is

$$\sigma_m = F(a_0, a_1, \ldots, a_m) = \sum_{i=1}^{TP} w(x_i) \left[ y_i - \sum_{j=0}^{m} a_j x_i^j \right]^2$$

where m is the degree of the model, $x_i$ and $y_i$ are discrete data points, $a_j$ is the
coefficient to be determined, $w(x_i)$ is the weighting factor, and TP is the number
of data points (ref. 1). If the data can be accurately represented by a polynomial,
then $\sigma_m = F(a_0, a_1, \ldots, a_m)$ tends to zero as m approaches L, where L is the
degree of the approximating polynomial (ref. 1). However, in practice, low-order
polynomials are preferred for curve fitting, and thus, it is sufficient to notice the
magnitude of change in $\sigma_m$ as m increases. In other words, if

$$\left| \frac{\sigma_m}{\sigma_{m+1}} - 1 \right| < \epsilon_1 < 1$$


a polynomial of degree m + 1 is considered sufficient (ref. 1). (An additional
restraint,

$$R < \epsilon_2$$

where R represents the maximum percentage of error incurred over the entire
collection of points, would be desirable as it would provide an alternate criterion
for evaluating the quality of the fit.) One of the least desirable features of this
conventional approach is the need to examine $\sigma_m$ for $1 \le m \le L$ when the data are
nonoscillatory. This procedure results in a needless waste of computer time in
converging to L if L > 2. This problem is addressed in the following discussion.
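
The conventional procedure described above can be sketched in a few lines. The following is a minimal illustration, not the report's program: it assumes NumPy's `polyfit` as the least-squares solver, and the function names `weighted_residual` and `conventional_degree`, the default tolerance `eps1 = 0.05`, and the `max_degree` cap are hypothetical choices made for the example.

```python
import numpy as np


def weighted_residual(x, y, m, w=None):
    """Return sigma_m, the weighted sum of squared residuals of the
    degree-m least-squares polynomial fitted to (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones_like(x) if w is None else np.asarray(w, dtype=float)
    # np.polyfit applies its weights to the unsquared residuals,
    # so sqrt(w) is passed to minimize sum(w_i * r_i**2).
    coeffs = np.polyfit(x, y, m, w=np.sqrt(w))
    r = y - np.polyval(coeffs, x)
    return float(np.sum(w * r ** 2))


def conventional_degree(x, y, eps1=0.05, max_degree=10, w=None):
    """Step m = 1, 2, ... and return m + 1 as soon as
    |sigma_m / sigma_{m+1} - 1| < eps1 (assumed tolerance)."""
    sigma_m = weighted_residual(x, y, 1, w)
    for m in range(1, max_degree):
        sigma_next = weighted_residual(x, y, m + 1, w)
        # An exactly zero residual means the fit is already exact.
        if sigma_next == 0.0 or abs(sigma_m / sigma_next - 1.0) < eps1:
            return m + 1
        sigma_m = sigma_next
    return max_degree  # assumed fallback when the criterion is never met
```

Every call to `conventional_degree` starts at m = 1, which is the cost the text identifies for nonoscillatory data whose required degree is large.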

Let K be a constant such that 0 < K < 1. The data are considered to be
oscillatory if one of the following conditions is true: (1) NP/TP > K, where TP represents
the total number of data points and NP is the total number of peaks, or (2) NP > I,
where I is a constant integer (for example, I = 10). A peak is considered
to have occurred if

$$(y_i - y_{i+1})(y_{i+1} - y_{i+2}) < 0$$

where $x_i < x_{i+1} < x_{i+2}$. If a polynomial, P(x), is sought to smooth data that are
considered oscillatory based on condition (1) or (2) above, then by observing
$\sigma_m$ as m increases and by applying the above criterion for quality of fit, it can be
observed that $|(\sigma_m/\sigma_{m+1}) - 1|$ approaches zero for small values of m almost
without exception. Of course, by choosing a large enough value of m, one could match
the data closely, but the smoothing properties of the curve would be sacrificed.
Moreover, the degree of the polynomial would be sufficiently large to make the
computational time prohibitive. Therefore, this technique selects polynomials of
small degree for oscillatory data and, hence, tends to smooth the data.
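
The peak test and the two oscillation conditions translate directly into code. The sketch below is illustrative only: the names `count_peaks` and `is_oscillatory` are hypothetical, K = 0.25 is an assumed value (the text requires only 0 < K < 1), and I = 10 is the example value given above.

```python
import numpy as np


def count_peaks(y):
    """Count NP: the number of indices i for which
    (y_i - y_{i+1}) * (y_{i+1} - y_{i+2}) < 0."""
    d = -np.diff(np.asarray(y, dtype=float))  # d[i] = y_i - y_{i+1}
    return int(np.sum(d[:-1] * d[1:] < 0))


def is_oscillatory(y, K=0.25, I=10):
    """Apply conditions (1) NP/TP > K and (2) NP > I.
    K = 0.25 is an assumed constant, not a value from the report."""
    tp = len(y)              # TP, total number of data points
    n_peaks = count_peaks(y)  # NP, total number of peaks
    return (n_peaks / tp > K) or (n_peaks > I)
```

With these assumed constants, a zigzag such as `[0, 1, 0, 1, 0, 1, 0]` is classed as oscillatory, while 21 samples of a parabola, which contain a single peak, are not.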

If, on the other hand, the data are determined to be nonoscillatory and are
assumed to have no wild points, the following analysis applies.

Let NP be the total number of peaks and P(x) be the desired polynomial that
approximates the data. From the definition of a peak and because a polynomial and
its derivatives are continuous, there exists a point $(x_0, y_0)$, where $x_i < x_0 < x_{i+1}$,
such that $P'(x_0) = 0$. That is, $P(x_0)$ is a local extremum or peak for P(x). Hence,
if there are NP local extrema, there are at least NP real zeros for P'(x). This
implies that the degree of P'(x) must be at least NP, which in turn implies that the
degree of P(x) must be at least NP + 1.
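
The degree bound can be written as a short chain of inequalities. The restatement below assumes only the standard fact that a nonzero polynomial of degree n has at most n real zeros; it adds nothing beyond the argument above.

```latex
\[
NP \;\le\; \#\{\, x_0 : P'(x_0) = 0 \,\} \;\le\; \deg P' \;=\; \deg P - 1
\qquad\Longrightarrow\qquad
\deg P \;\ge\; NP + 1 .
\]
```

For instance, a polynomial that reproduces NP = 2 peaks must be of degree 3 or higher.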

This argument can be extended to second derivatives. That is, 

$$\left[ \frac{y_{i+2} - y_{i+1}}{x_{i+2} - x_{i+1}} - \frac{y_{i+1} - y_i}{x_{i+1} - x_i} \right] \left[ \frac{y_{i+1} - y_i}{x_{i+1} - x_i} - \frac{y_i - y_{i-1}}{x_i - x_{i-1}} \right] < 0$$

implies that there exists a point $x_0$, where $x_i < x_0 < x_{i+1}$, such that $P''(x_0) = 0$,
where P(x) is the approximating polynomial. Let Nx be the sum of all such
occurrences. It is then sufficient to examine the value $|1 - (\sigma_m/\sigma_{m+1})|$ for
$1 \le NC \le m \le L$, where L is the degree of the approximating polynomial and
NC = maximum [(NP + 1), (Nx + 2)]. This approach is an improvement over the
conventional methods, and the improvement should increase as the degree of the
polynomial required increases. Finally, an additional advantage of this technique
is the increased probability of obtaining a good fit. When using the conventional
method, the ratio $\sigma_m/\sigma_{m+1}$ may approach 1 with m small and $\sigma_m$ large, which
would cause the iteration to terminate too soon. This obviously would produce
a rather poor approximating polynomial when the data are nonoscillatory but have
several extrema.
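
The starting degree NC and the shortened search can be sketched as follows. This is a hedged illustration rather than the report's implementation: the function names, the unweighted residual, the default `eps1 = 0.05`, and the `max_degree` cap are assumptions, and x is assumed to be strictly increasing.

```python
import numpy as np


def sign_changes(v):
    """Count adjacent pairs in v whose product is strictly negative."""
    v = np.asarray(v, dtype=float)
    return int(np.sum(v[:-1] * v[1:] < 0))


def starting_degree(x, y):
    """Return NC = max(NP + 1, Nx + 2) for strictly increasing x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slopes = np.diff(y) / np.diff(x)               # first divided differences
    n_peaks = sign_changes(slopes)                 # NP: slope changes sign
    n_inflections = sign_changes(np.diff(slopes))  # Nx: slope difference changes sign
    return max(n_peaks + 1, n_inflections + 2)


def residual(x, y, m):
    """Unweighted sum of squared residuals of the degree-m fit (w = 1 assumed)."""
    coeffs = np.polyfit(x, y, m)
    return float(np.sum((np.asarray(y, dtype=float) - np.polyval(coeffs, x)) ** 2))


def select_degree(x, y, start, eps1=0.05, max_degree=10):
    """Search m = start, start + 1, ... and return m + 1 once
    |1 - sigma_m / sigma_{m+1}| < eps1 (assumed tolerance)."""
    sigma_m = residual(x, y, start)
    for m in range(start, max_degree):
        sigma_next = residual(x, y, m + 1)
        if sigma_next == 0.0 or abs(1.0 - sigma_m / sigma_next) < eps1:
            return m + 1
        sigma_m = sigma_next
    return max_degree  # assumed fallback when the criterion is never met
```

Calling `select_degree(x, y, start=1)` mimics the conventional search, and `select_degree(x, y, start=starting_degree(x, y))` mimics the proposed one. For the small hypothetical data set x = -3, ..., 3 and y = (-2.1, 2.4, 1.5, 0, -1.5, -2.4, 2.1), which has two peaks, the search from m = 1 stops at degree 2 because the residuals for m = 1 and m = 2 coincide, whereas the search from NC = 3 skips that premature stop.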

These observations indicate that both accuracy and efficiency can be improved
with the proposed method. Illustrative examples are given in the appendix.


CONCLUDING REMARKS 


While the approach presented in this paper offers nothing new for oscillatory
data, it defines a criterion that discriminates between oscillatory and nonoscillatory
data and attempts to handle both without loss of generality. The approach
eliminates the need to examine the residuals of polynomials with degrees from 1
to the degree of the approximating polynomial for nonoscillatory data without
sacrificing the performance of the least-squares method as it relates to oscillatory
data. It also increases the probability of selecting a good approximating
polynomial for nonoscillatory data.

NASA Dryden Flight Research Center
Edwards, Calif., November 9, 1977

APPENDIX-ILLUSTRATIVE EXAMPLES 


Figures 1 and 2 are examples of oscillatory and nonoscillatory data that are
smoothed and approximated using the method presented in the body of this report.
These examples show the generality of the proposed method. Figure 3 shows the
weakness of the conventional method. From these examples, it is clear that the
proposed method is an improvement over the conventional technique for
automatically choosing the degree of a polynomial for curve fitting.

Figure 3. Nonoscillatory data approximated with conventional method.